Scanner-driven email message decomposition

ABSTRACT

A method, system, and computer program product for scanning emails by reducing the amount of decomposition processing that is performed to only the minimum necessary to fully scan the emails. This reduces the server resources needed, which improves server throughput and reduces costs. A method for processing email messages comprises the steps of receiving an email message comprising a plurality of items, scanning the email message with at least one scanner software, determining with each of the at least one scanner softwares what items of the plurality of items the email message is to be decomposed into, and decomposing the email message to obtain the items determined by each of the at least one scanner software.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to scanning emails using a “scanner-driven” model, in which each scanner requests the amount of decomposition it requires to make a decision on the binary stream.

2. Description of the Related Art

The prevalence of unsolicited commercial email, commonly known as spam, has grown rapidly and is still growing. The corporate world and individual home users are spending millions of dollars to combat spam. Internet Service Providers (ISPs) have to cope with greatly increasing day-to-day amounts of network traffic due to the increase in spam emails. If spam traffic continues to grow, it may become unmanageable in the near future.

Another common and growing problem is the spread of computer malware. A typical computer malware is a program or piece of code that is loaded onto a computer and/or performs some undesired actions on a computer without the knowledge or consent of the computer operator. The most widespread, well-known and dangerous type of computer malware is the computer virus, that is, a program or piece of code that replicates itself and loads itself onto other connected computers. Once the virus has been loaded onto the computer, it is activated and may proliferate further and/or damage the computer or other computers.

Typically, incoming emails may be scanned for a variety of undesirable contents. For example, emails may be scanned to determine whether or not they are spam, whether or not they include viruses or other malware, or whether or not they include inappropriate or other “bad” content.

Typically, spam has been fought by the use of software that scans incoming email messages to determine whether each message is spam, includes malware, or includes bad content. If so, the messages are accordingly marked as ***SPAM*** or quarantined. When a data stream is presented for scanning it is often a compound object such as a MIME stream or archive file. This stream is decomposed into its constituent files before being presented to the AntiVirus, AntiSpam, bad content, and other scanners. Traditionally this process has been “decomposition-driven”. That is, the binary stream is decomposed into as many different parts as possible and then each of these parts is presented to the scanners.

However, a large ISP can receive millions of emails each day, each of which must be scanned. Other large organizations may receive thousands of emails each day. On average, each email takes from 15 milliseconds to 400 milliseconds to scan for such spam content. This consumes a huge amount of email server time and can in turn create a loss in the productivity of the organization.

SUMMARY OF THE INVENTION

A method, system, and computer program product for scanning emails by reducing the amount of decomposition processing that is performed to only the minimum necessary to fully scan the emails. This reduces the server resources needed, which improves server throughput and reduces costs. This provides an alternative “scanner-driven” model, in which each scanner requests the amount of decomposition it requires to make a decision on the binary stream and no more, thus optimizing the amount of decomposition carried out for any one scan. Such a model is particularly relevant to AntiSpam scanning, where a decision can often be made before all possible levels of decomposition have been carried out. In more general terms it is applicable when users have turned off certain scanners, such as “content”.

A method for processing email messages comprises the steps of receiving an email message comprising a plurality of items, scanning the email message with at least one scanner software, determining with each of the at least one scanner softwares what items of the plurality of items the email message is to be decomposed into, and decomposing the email message to obtain the items determined by each of the at least one scanner software.

The method may further comprise the step of scanning the items obtained by each of the at least one scanner softwares with that scanner software. The determining step may comprise the step of determining what items of the plurality of items the email message is to be decomposed into based on the items of the plurality of items that a scanner software is capable of scanning. The at least one scanner softwares may comprise a plurality of scanner softwares and the decomposing step may comprise the step of decomposing the email message to obtain each item determined by at least one of the plurality of scanner softwares only once. The method may further comprise the step of scanning the items obtained by at least one of the plurality of scanner softwares with that scanner software. The determining step may comprise the step of determining what items of the plurality of items the email message is to be decomposed into based on the items of the plurality of items that a scanner software is capable of scanning. The plurality of scanners may comprise at least one of an anti-virus scanner, an anti-spam scanner, and a bad content scanner. The plurality of items of the email message may comprise at least one of a MIME stream, MIME headers, an HTML item, a ZIP item, a text item, a document, and a list of URLs. The email messages may be incoming email messages or the email messages may be outgoing email messages.

BRIEF DESCRIPTION OF THE DRAWINGS

The details of the present invention, both as to its structure and operation, can best be understood by referring to the accompanying drawings, in which like reference numbers and designations refer to like elements.

FIG. 1 is an exemplary block diagram of a system in which the present invention may be implemented.

FIG. 2 is an exemplary block diagram of decomposition of a compound item, which is a MIME stream.

FIG. 3 is an exemplary pseudo-code listing of a root function implementing scanner-driven decomposition.

FIG. 4 is an exemplary pseudo-code listing of a scanner function implementing scanner-driven decomposition.

FIG. 5 is an exemplary pseudo-code listing of a node function implementing scanner-driven decomposition.

FIG. 6 is an exemplary pseudo-code listing of a node function implementing scanner-driven decomposition.

FIG. 7 is an exemplary pseudo-code listing of a decomposer function implementing scanner-driven decomposition.

FIG. 8 is an exemplary pseudo-code listing of a decomposer function implementing scanner-driven decomposition.

FIG. 9 is an exemplary listing of scanners that may be used in implementing scanner-driven decomposition.

FIG. 10 is an exemplary listing of decomposers that may be used in implementing scanner-driven decomposition.

FIG. 11 a is an exemplary block diagram of a first stage of decomposition of a compound item, which is a MIME stream.

FIG. 11 b is a diagram of a first stage of an example of scanner-driven decomposition.

FIG. 12 a is an exemplary block diagram of a second stage of decomposition of a compound item, which is a MIME stream.

FIG. 12 b is a diagram of a second stage of an example of scanner-driven decomposition.

FIG. 13 a is an exemplary block diagram of a third stage of decomposition of a compound item, which is a MIME stream.

FIG. 13 b is a diagram of a third stage of an example of scanner-driven decomposition.

FIG. 14 a is an exemplary block diagram of a fourth stage of decomposition of a compound item, which is a MIME stream.

FIG. 14 b is a diagram of a fourth stage of an example of scanner-driven decomposition.

FIG. 15 a is an exemplary block diagram of a fifth stage of decomposition of a compound item, which is a MIME stream.

FIG. 15 b is a diagram of a fifth stage of an example of scanner-driven decomposition.

FIG. 16 a is an exemplary block diagram of a sixth stage of decomposition of a compound item, which is a MIME stream.

FIG. 16 b is a diagram of a sixth stage of an example of scanner-driven decomposition.

FIG. 17 a is an exemplary block diagram of a second stage of decomposition of a compound item, which is a MIME stream.

FIG. 17 b is a diagram of a second stage of an example of scanner-driven decomposition.

FIG. 18 a is an exemplary block diagram of a third stage of decomposition of a compound item, which is a MIME stream.

FIG. 18 b is a diagram of a third stage of an example of scanner-driven decomposition.

FIG. 19 a is an exemplary block diagram of a fourth stage of decomposition of a compound item, which is a MIME stream.

FIG. 19 b is a diagram of a fourth stage of an example of scanner-driven decomposition.

FIG. 20 a is an exemplary block diagram of a fifth stage of decomposition of a compound item, which is a MIME stream.

FIG. 20 b is a diagram of a fifth stage of an example of scanner-driven decomposition.

FIG. 21 is an exemplary block diagram of completed decomposition of a compound item, which is a MIME stream.

FIG. 22 is an exemplary block diagram of completed decomposition of a compound item, which is a MIME stream.

FIG. 23 is an exemplary block diagram of an email server, in which the present invention may be implemented.

DETAILED DESCRIPTION OF THE INVENTION

A method, system, and computer program product for scanning emails reduces the server resources needed, which improves server throughput and reduces costs. This provides an alternative “scanner-driven” model, in which each scanner requests the amount of decomposition it requires to make a decision on the binary stream and no more, thus optimizing the amount of decomposition carried out for any one scan. Such a model is particularly relevant to AntiSpam scanning, where a decision can often be made before all possible levels of decomposition have been carried out. In more general terms it is applicable when users have turned off certain scanners, such as “content”.

A block diagram of a system 100 in which the present invention may be implemented is shown in FIG. 1. Email server 102 receives email messages 104 via the Internet 106, or other unsecure network. The email messages are processed by email scanner 108. Email scanner 108 automates the highlighting, removal or filtering of e-mail spam, malware, and/or bad content by scanning through incoming and outgoing e-mails in search of traits typical of such undesirable items. Such scanning may include searching for patterns in the headers or bodies of messages. Each incoming email message is scanned to determine whether it is a dangerous spam email message, including malware or bad content, which is to be quarantined 110, a spam email message that is to be marked as SPAM 112 and delivered to the recipient's inbox 114, or a clean email message 116 that is to be delivered as is to the recipient's inbox 114. Email scanner 108 includes a plurality of scanners 118A-N, each of which is capable of scanning one or more different item types and scanning for one or more types of undesirable content. A scanner is a component that can run against an item to determine whether or not it has undesirable content, such as AntiVirus, AntiSpam, bad content, and other scanners. An item is a stream of data and an item type is a category of item, such as a MICROSOFT WORD® document or a MICROSOFT WINDOWS® executable. A scannable item type is an item type that can be scanned by one or more scanners. Note that this can include compound items. An example is a MIME stream, which can be scanned by an AntiVirus scanner, an AntiSpam scanner, a bad content scanner, and other scanners. An item of type unknown is an item of a type that cannot be established until the item has been decomposed. A compound item is an item that can be decomposed to one or more other items of type unknown. Example compound item types include zip files and MIME streams (emails).

Email scanner 108 also includes a plurality of decomposers 120A-M. A decomposer is a component that can decompose items of a particular type to one or more constituent items. A decomposition tree is a tree representing the current decomposition state of a compound item, with each node in the tree representing one item. An example of a decomposition tree 200 is shown in FIG. 2. The example of FIG. 2 shows the decomposition of a compound item 202, which is a MIME stream. Multipurpose Internet Mail Extensions (MIME) is an Internet Standard that extends the format of e-mail to support text in character sets other than US-ASCII, non-text attachments, multi-part message bodies, and header information in non-ASCII character sets. MIME is also a fundamental component of communication protocols such as HTTP, which requires that data be transmitted in the context of e-mail-like messages, even though the data may not actually be e-mail.

MIME item 202 can be decomposed into a plurality of constituent items, such as MIME headers 204, HyperText Markup Language (HTML) item 206, and ZIP item 208. Thus, decomposition tree 200 includes a number of branches. A decomposition sub-tree is a decomposition tree that is a branch of another decomposition tree. MIME headers 204 include information about MIME item 202 and about the items included in MIME item 202, such as HTML item 206 and ZIP item 208. Typically, a decomposer capable of decomposing MIME items will use MIME headers 204 to decompose MIME item 202 into its constituent items. HTML item 206 and ZIP item 208 are themselves compound items that may be decomposed into further constituent items. Thus, HTML item 206 and ZIP item 208 form decomposition sub-trees in FIG. 2.

HTML item 206 includes information in the HTML language. HTML is a predominant markup language for the creation of web pages. It provides a means to describe the structure of text-based information in a document. HTML denotes certain text as headings, paragraphs, lists, and so on, and supplements that text with interactive forms, embedded images, and other objects. HTML item 206 is a compound item that includes a plurality of items, such as text item 210. Likewise, text item 210 includes a plurality of items, such as a list of Uniform Resource Locators (URLs) 212. Each constituent item may be obtained by decomposing the inclusive item with one or more decomposers.

ZIP item 208 includes information in the ZIP file format. The ZIP file format is a popular data compression and archival format. A ZIP file contains one or more files or documents, such as document 214, which have been compressed or stored. Likewise, each document, such as document 214, includes constituent items, such as text item 216. Finally, text item 216 includes a plurality of items, such as a list of URLs 216. Each constituent item may be obtained by decomposing the inclusive item with one or more decomposers.
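The decomposition tree of FIG. 2 can be pictured as a simple recursive data structure. The following Python fragment is a minimal sketch only; the names Node, add, build_fig2_tree, count and depth are assumptions made for illustration and do not appear in the figures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One item in a decomposition tree (hypothetical name)."""
    item_type: str
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a constituent item and return it, so calls can be chained."""
        self.children.append(child)
        return child


def build_fig2_tree() -> Node:
    """Build the fully decomposed tree described for FIG. 2."""
    mime = Node("MIME")                             # MIME item 202
    mime.add(Node("MIME headers"))                  # MIME headers 204
    html = mime.add(Node("HTML"))                   # HTML item 206
    html.add(Node("text")).add(Node("URL list"))    # text item 210, URLs 212
    zip_item = mime.add(Node("ZIP"))                # ZIP item 208
    doc = zip_item.add(Node("document"))            # document 214
    doc.add(Node("text")).add(Node("URL list"))     # text item and its URL list
    return mime


def count(node: Node) -> int:
    """Number of items in the tree rooted at node."""
    return 1 + sum(count(child) for child in node.children)


def depth(node: Node) -> int:
    """Number of levels in the tree rooted at node."""
    return 1 + max((depth(child) for child in node.children), default=0)
```

In this sketch the HTML node and the ZIP node are each the root of a decomposition sub-tree, in the sense used above.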

Each item in decomposition tree 200 may be scanned by a scanner that is capable of scanning one or more different item types and scanning for one or more types of undesirable content. Compound items may, in some cases, be fully scanned by a scanner. However, typically, a compound item must be decomposed into its constituent items, and then each constituent item is scanned by the appropriate scanner. A scanner reports that it is satisfied by a decomposition tree when it has scanned the contents of that tree to its own satisfaction at the current state of decomposition. If all sub-trees of a decomposition tree satisfy a scanner, then the decomposition tree satisfies that scanner.
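The rule that a tree satisfies a scanner when all of its sub-trees do can be expressed as a short recursive check. This is a hedged sketch: the satisfied_scanners set and the function name tree_satisfies are assumptions, and treating an undecomposed, unsatisfied node as "not satisfied" is one possible reading of the rule.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """One item in a decomposition tree (hypothetical name)."""
    item_type: str
    children: List["Node"] = field(default_factory=list)
    # Names of scanners that reported satisfaction directly on this item.
    satisfied_scanners: Set[str] = field(default_factory=set)


def tree_satisfies(node: Node, scanner: str) -> bool:
    """Return True if the tree rooted at node satisfies the named scanner."""
    if scanner in node.satisfied_scanners:
        return True
    # Otherwise the tree satisfies the scanner only if it has been decomposed
    # and every sub-tree satisfies the scanner.
    return bool(node.children) and all(
        tree_satisfies(child, scanner) for child in node.children
    )
```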

Exemplary pseudo-code samples of an exemplary method of scanner-driven decomposition are shown in FIGS. 3-8. This exemplary method is scanner-driven. This means the decomposition tree is only expanded as far as is necessary to satisfy all sub-trees for all scanners and no further. Thus unnecessary decompositions are avoided. One decomposition tree is used by all scanners so no decomposition step is carried out more than once.
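One way to obtain the property that no decomposition step is carried out more than once is for each node of the shared tree to keep the result of its first decomposition. The Python fragment below is a sketch under that assumption; Node, children, split_mime and demo are illustrative names, and the hard-coded MIME parts stand in for a real decomposer.

```python
from typing import Callable, List, Optional, Tuple

Parts = List[Tuple[str, bytes]]  # (item type, raw data) pairs


class Node:
    """One item in the shared decomposition tree (hypothetical name)."""

    def __init__(self, item_type: str, data: bytes = b""):
        self.item_type = item_type
        self.data = data
        self._children: Optional[List["Node"]] = None  # None: not decomposed yet

    def children(self, decompose: Callable[["Node"], Parts]) -> List["Node"]:
        # Run the decomposer only on the first request; later requests from
        # other scanners reuse the nodes already stored in the tree.
        if self._children is None:
            self._children = [Node(t, d) for t, d in decompose(self)]
        return self._children


def demo() -> int:
    """Three scanners request the same children; count real decompositions."""
    calls = []

    def split_mime(node: Node) -> Parts:
        calls.append(node.item_type)  # record each real decomposition
        return [("MIME headers", b"Subject: hi"), ("HTML", b"<p>hi</p>")]

    root = Node("MIME")
    for _scanner in ("anti-virus", "anti-spam", "bad content"):
        root.children(split_mime)
    return len(calls)
```

In this sketch, demo() reports a single decomposition even though three scanners asked for the children of the root.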

In the exemplary function 300, shown in FIG. 3, each scanner is able to drive the decomposition to whatever level it requires to be satisfied by the sub-tree. The method is driven by a recursive function 300 that takes two parameters: a node in the decomposition tree and a list of scanners to scan with. In step 302, each scanner in the list is called and returns whether or not it is satisfied. If a scanner is satisfied, it is removed from the list. For the first call (on the root node), the scanner list contains all of the available scanners. In step 304, if there are any remaining scanners in the list, the current decomposition node is decomposed. In step 306, the child nodes resulting from the decomposition performed in step 304 are scanned with the scannersremaining in the list. The root call completes when all scanners are satisfied by the whole decomposition tree.
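Steps 302 to 306 can be sketched in Python as follows. This is not the listing of FIG. 3, which is not reproduced here; the names Node, satisfied_by and scan_tree, the representation of a scanner as a function returning True when satisfied, and the returned set of still-unsatisfied scanner names are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Set


class Node:
    """A tree node whose constituent items appear only when decomposed."""

    def __init__(self, item_type: str, parts: Optional[List["Node"]] = None):
        self.item_type = item_type
        self._parts = parts or []                      # not yet visible items
        self.children: Optional[List["Node"]] = None   # None: not decomposed

    def decompose(self) -> List["Node"]:
        if self.children is None:                      # decompose at most once
            self.children = list(self._parts)
        return self.children


Scanner = Callable[[Node], bool]  # returns True when satisfied by the node


def satisfied_by(*types: str) -> Scanner:
    """Toy scanner that is satisfied by items of the given types."""
    return lambda node: node.item_type in types


def scan_tree(node: Node, scanners: Dict[str, Scanner]) -> Set[str]:
    """Return the names of scanners left unsatisfied by this sub-tree."""
    # Step 302: call each scanner and drop the ones that are satisfied.
    remaining = {name: scan for name, scan in scanners.items() if not scan(node)}
    if not remaining:
        return set()
    # Step 304: scanners remain, so decompose the current node.
    children = node.decompose()
    if not children:
        return set(remaining)                          # nothing left to expand
    # Step 306: scan the child nodes with the remaining scanners only.
    unsatisfied: Set[str] = set()
    for child in children:
        unsatisfied |= scan_tree(child, remaining)
    return unsatisfied
```

With a single toy anti-virus scanner that is satisfied by MIME headers, HTML items and documents, this sketch decomposes the MIME root and a ZIP child but never decomposes the HTML child, which illustrates how decomposition stops as soon as the remaining scanners are satisfied.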

Each scanner implements a version of the function shown in FIG. 4. In the function 400, shown in FIG. 4, each scanner is able to drive the decomposition to whatever level it requires to be satisfied by the sub-tree. In step 402, function 400 determines whether the decomposition node being processed is a type that is supported by the scanner. In step 404, if the node is supported, the node is scanned, including recursively further decomposing the node.
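A scanner function along the lines of steps 402 and 404 might look like the sketch below. The class name BadContentScanner, the banned-word check, and the choice to report "not satisfied" for unsupported item types are assumptions for illustration; the recursive further decomposition mentioned in step 404 is omitted to keep the example short.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class Node:
    """One item in the decomposition tree (hypothetical name)."""
    item_type: str
    data: str = ""


@dataclass
class BadContentScanner:
    """Toy bad content scanner that only understands text items."""
    supported_types: FrozenSet[str] = frozenset({"text"})
    banned_words: FrozenSet[str] = frozenset({"forbidden"})
    findings: List[str] = field(default_factory=list)

    def scan(self, node: Node) -> bool:
        """Return True if this scanner is satisfied by the node."""
        # Step 402: is the node of a type supported by this scanner?
        if node.item_type not in self.supported_types:
            return False  # assumption: unsupported types leave it unsatisfied
        # Step 404: the node is supported, so scan it and record any hits.
        words = node.data.lower().split()
        self.findings.extend(w for w in words if w in self.banned_words)
        return True
```

Returning False for an unsupported type lets the root function decompose the node further until an item of a supported type appears.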

Each node in the decomposition tree supports the functions shown in FIGS. 5 and 6. The function 500, shown in FIG. 5, attempts to decompose one more level in the sub-tree to the specified type by, in step 502, invoking each decomposer in turn. The function 600 shown in FIG. 6 attempts to establish the type of the node by, in step 602, invoking each decomposer in turn.

Each decomposer supports the functions shown in FIGS. 7 and 8. In the function 700, shown in FIG. 7, the decomposer, in step 702, establishes whether the given item is of a type it recognizes. In the function 800, shown in FIG. 8, the decomposer carries out its decomposition, first, in step 802, determining whether it supports either node, then, in step 804, creating additional nodes in the decomposition tree as required.
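The interplay between the node functions and the decomposer functions can be sketched with a ZIP decomposer built on the Python standard library zipfile module. The class and method names (Node, establish_type, decompose, ZipDecomposer, recognizes) are assumptions for illustration and are not taken from FIGS. 5-8.

```python
import io
import zipfile
from typing import List


class Node:
    """One item in the decomposition tree; its type starts as unknown."""

    def __init__(self, data: bytes, item_type: str = "unknown"):
        self.data = data
        self.item_type = item_type
        self.children: List["Node"] = []

    def establish_type(self, decomposers) -> str:
        # In the spirit of FIG. 6: ask each decomposer in turn whether it
        # recognizes the item.
        if self.item_type == "unknown":
            for decomposer in decomposers:
                if decomposer.recognizes(self.data):
                    self.item_type = decomposer.from_type
                    break
        return self.item_type

    def decompose(self, decomposers) -> bool:
        # In the spirit of FIG. 5: invoke each decomposer in turn and expand
        # one level with the first one that handles this item type.
        self.establish_type(decomposers)
        for decomposer in decomposers:
            if decomposer.from_type == self.item_type:
                self.children = decomposer.decompose(self)
                return True
        return False


class ZipDecomposer:
    """Decomposes ZIP items into constituent items of type unknown."""
    from_type = "ZIP"

    def recognizes(self, data: bytes) -> bool:
        # In the spirit of FIG. 7: is the item of a type this decomposer knows?
        return zipfile.is_zipfile(io.BytesIO(data))

    def decompose(self, node: Node) -> List[Node]:
        # In the spirit of FIG. 8: create one new node per archive member.
        with zipfile.ZipFile(io.BytesIO(node.data)) as archive:
            return [Node(archive.read(name)) for name in archive.namelist()]
```

The new child nodes are left as type unknown, consistent with the definition of a compound item given earlier; their types would be established by a later call to establish_type.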

Examples of scanners that may be used, and their characteristics, are shown in FIG. 9. For example, scanner 902 may be an anti-virus scanner, an anti-spam scanner, a bad content scanner, etc. Each scanner has associated item types that may be satisfied by a scan 904, such as documents, HTML items, MIME items, text items, etc. Likewise, each scanner has associated item types that it can scan 906, such as MIME items, documents, HTML items, MIME headers, text items, lists of URLs, etc.

It is to be noted that the scanners shown in FIG. 9 are merely examples. The present invention contemplates use with any type of scanner, and scanners capable of scanning any type of item.

Examples of decomposers that may be used, and their characteristics, are shown in FIG. 10. For example, decomposer 1002 may decompose MIME items, ZIP items, HTML items, text items, documents, etc. Each decomposer has associated item types that may be decomposed from 1004, such as MIME items, ZIP items, HTML items, text items, documents, etc. Likewise, each decomposer has associated item types that it can decompose items to 1006, such as MIME headers, HTML items, unknown items, text items, lists of URLs, etc.

It is to be noted that the decomposers shown in FIG. 10 are merely examples. The present invention contemplates use with any type of decomposer, and decomposers capable of decomposing any type of item.

An example of processing of a data stream using scanner-driven decomposition is shown in FIGS. 11-16. This example assumes that the data stream contains no viruses, spam or bad content. The example is best viewed in conjunction with the decomposition tree 200, shown in FIG. 2. At the first stage of the decomposition example, the decomposition tree includes only MIME item 202, as shown in FIG. 11 a. Turning to FIG. 11 b, it is seen that the unsatisfied scanners at the beginning of this stage 1102 include the anti-virus scanner, the anti-spam scanner and the bad content scanner. The actions taken by each scanner at this stage 1104 are that the anti-virus scanner recognizes the item as MIME and scans it, the anti-spam scanner recognizes the item as MIME and begins to scan by performing a top level decomposition, and the bad content scanner is not run as it cannot handle MIME items. The result of this stage 1106 is that all scanners are unsatisfied.

At the second stage of the decomposition example, the decomposition tree includes MIME item 202, MIME headers 204 and HTML item 206, as shown in FIG. 12 a. Turning to FIG. 12 b, it is seen that the unsatisfied scanners at the beginning of this stage 1202 include the anti-virus scanner, the anti-spam scanner and the bad content scanner. The actions taken by each scanner at this stage 1204 are that the anti-spam scanner decomposes the MIME headers node 204 and scans it. As it cannot determine whether the mail is spam on this basis alone, it decomposes to the HTML node 206 and scans that. The bad content scanner is not run as it cannot handle MIME items and the anti-virus scanner is also not run. The result of this stage 1206 is that all scanners are unsatisfied.

At the third stage of the decomposition example, the decomposition tree includes MIME item 202, MIME headers 204, HTML item 206, text item 210 and list of URLs 212, as shown in FIG. 13 a. Turning to FIG. 13 b, it is seen that the unsatisfied scanners at the beginning of this stage 1302 include the anti-virus scanner, the anti-spam scanner and the bad content scanner. The actions taken by each scanner at this stage 1304 are that the anti-spam scanner is still not able to complete and so decomposes and scans body text and URLs. It has now established that the MIME message is not spam and so is satisfied by the whole decomposition tree. The bad content scanner and the anti-virus scanner are not run. The result of this stage 1306 is that the anti-spam scanner is satisfied and the bad content scanner and the anti-virus scanner are unsatisfied.

At the fourth stage of the decomposition example, the decomposition tree includes MIME item 202, MIME headers 204, HTML item 206, text item 210 and list of URLs 212, as shown in FIG. 14 a. Turning to FIG. 14 b, it is seen that the unsatisfied scanners at the beginning of this stage 1402 include the anti-virus scanner and the bad content scanner. The actions taken by each scanner at this stage 1404 are that the anti-virus scanner scans the MIME headers item and, as it is not interested in that type, reports that it is not satisfied by it. The bad content scanner scans the MIME headers item and reports that it is satisfied by it. The anti-virus scanner scans the HTML and reports that it is satisfied by it. The bad content scanner reports that it is not satisfied by HTML. It is then presented with the Text node, which it scans, and reports that it is satisfied by. Note that this entire step does not involve any new decompositions. The result of this stage 1406 is that the bad content scanner is satisfied and the anti-virus scanner is not satisfied.

At the fifth stage of the decomposition example, the decomposition tree includes MIME item 202, MIME headers 204, HTML item 206, text item 210, list of URLs 212, ZIP item 208 and document 214, as shown in FIG. 15 a. Turning to FIG. 15 b, it is seen that the unsatisfied scanners at the beginning of this stage 1502 include the anti-virus scanner and the bad content scanner (which is not satisfied now that additional decomposition has occurred). The actions taken by each scanner at this stage 1504 are that the anti-virus scanner decomposes the ZIP item 208 and scans the document 214 and is satisfied by it. The bad content scanner does not handle these types of items and so is not satisfied. The result of this stage 1506 is that the bad content scanner is not satisfied and the anti-virus scanner is satisfied.

At the sixth stage of the decomposition example, the decomposition tree includes MIME item 202, MIME headers 204, HTML item 206, text item 210, list of URLs 212, ZIP item 208, document 214, and text item 216, as shown in FIG. 16 a. Turning to FIG. 16 b, it is seen that the unsatisfied scanners at the beginning of this stage 1602 include only the bad content scanner. The actions taken by each scanner at this stage 1604 are that the bad content scanner decomposes document 214 to text item 216, scans it and is satisfied. All scanners are now satisfied for all subtrees of MIME and therefore the scan is complete. The result of this stage 1606 is that all scanners are satisfied.

The example shown in FIGS. 11-16 shows how the method can scan a decomposition tree using three scanners without performing any decomposition steps more than once. However, virtually the entire decomposition tree is expanded (only the final URL list step is avoided). In the example shown in FIGS. 17-20, it is assumed that the MIME message is spam that can be detected as such purely on the basis of its headers. Stage 1 is as shown in FIG. 11. From there the scan proceeds with stage two, shown in FIGS. 17 a and 17 b. At the second stage of this example, the decomposition tree includes MIME item 202 and MIME headers 204, as shown in FIG. 17 a. All scanners are initially unsatisfied 1702, as shown in FIG. 17 b. The actions taken 1704 are that the anti-spam scanner decomposes the MIME Headers and scans them. On this basis it is able to determine that the mail is spam and completes its scan without any further decomposition. The anti-virus scanner and the bad content scanner are not run. The result 1706 is that the anti-spam scanner is satisfied by the MIME items, and the anti-virus and bad content scanners are not satisfied.

At the third stage of this example, the decomposition tree includes MIME item 202 and MIME headers 204, as shown in FIG. 18 a. Turning to FIG. 18 b, it is seen that the unsatisfied scanners 1802 include the anti-virus scanner and the bad content scanner. The actions taken 1804 are that the anti-virus scanner scans the MIME headers item and, as it is not interested in that type, reports that it is satisfied by it. The bad content scanner scans the MIME headers item and reports that it is satisfied by it. The result 1806 is that the anti-virus scanner is satisfied by the MIME items, and the bad content scanner is not satisfied.

At the fourth stage of this example, the decomposition tree includes MIME item 202, MIME headers 204, and HTML item 206, as shown in FIG. 19 a. Turning to FIG. 19 b, it is seen that the unsatisfied scanners 1902 include the anti-virus scanner (which has not examined the HTML item 206) and the bad content scanner. The actions taken 1904 are that, as the MIME items have not satisfied all scanners, the HTML node 206 is decomposed. The anti-virus scanner scans HTML item 206 and is satisfied by it. The bad content scanner is not satisfied, however. The result 1906 is that the anti-virus scanner is satisfied by HTML item 206, and the bad content scanner is not satisfied.

At the fifth stage of this example, the decomposition tree includes MIME item 202, MIME headers 204, HTML item 206, and text item 210, as shown in FIG. 20 a. Turning to FIG. 20 b, it is seen that the unsatisfied scanners 2002 include the bad content scanner. The actions taken 2004 are that the bad content scanner scans text item 210 and is satisfied. The result 2006 is that the bad content scanner is satisfied by text item 210.

The sixth stage of this example is similar to that shown in FIG. 16, although the method recursively decomposed the ZIP item 208 because at a previous stage, at which the recursion occurs, scanners were unsatisfied. At completion, the decomposition tree includes MIME item 202, MIME headers 204, HTML item 206, text item 210, ZIP item 208, document 214, and text item 218, as shown in FIG. 21. Thus, it is seen that at completion one more decomposition step has been avoided.

In another example, the method can scan a decomposition tree using two scanners (not using the bad content scanner). In this example, there is no need to decompose the HTML node or the Document node and the final decomposition tree includes MIME item 202, MIME headers 204, HTML item 206, ZIP item 208, and document 214. Two more decomposition steps have been avoided.

The described method is one possible way of implementing a scanner-driven model that is both simple and modular, allowing the addition of zero or more decomposers and scanners as are required by particular products in particular situations. A number of implementations of the method are possible. The present invention contemplates any and all such implementations.

An exemplary block diagram of an email server 2300, in which the present invention may be implemented, is shown in FIG. 23. Email server 2300 is typically a programmed general-purpose computer system, such as a personal computer, workstation, server system, and minicomputer or mainframe computer. Email server 2300 includes one or more processors (CPUs) 2302A-2302N, input/output circuitry 2304, network adapter 2306, and memory 2308. CPUs 2302A-2302N execute program instructions in order to carry out the functions of the present invention. Typically, CPUs 2302A-2302N are one or more microprocessors, such as an INTEL PENTIUM® processor. FIG. 23 illustrates an embodiment in which email server 2300 is implemented as a single multi-processor computer system, in which multiple processors 2302A-2302N share system resources, such as memory 2308, input/output circuitry 2304, and network adapter 2306. However, the present invention also contemplates embodiments in which email server 2300 is implemented as a plurality of networked computer systems, which may be single-processor computer systems, multi-processor computer systems, or a mix thereof.

Input/output circuitry 2304 provides the capability to input data to, or output data from, email server 2300. For example, input/output circuitry may include input devices, such as keyboards, mice, touchpads, trackballs, scanners, etc., output devices, such as video adapters, monitors, printers, etc., and input/output devices, such as modems, etc. Network adapter 2306 interfaces email server 2300 with Internet/intranet 2310. Internet/intranet 2310 may include one or more standard local area network (LAN) or wide area network (WAN), such as Ethernet, Token Ring, the Internet, or a private or proprietary LAN/WAN.

Memory 2308 stores program instructions that are executed by, and data that are used and processed by, CPU 2302 to perform the functions of email server 2300. Memory 2308 may include electronic memory devices, such as random-access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), flash memory, etc., and electro-mechanical memory, such as magnetic disk drives, tape drives, optical disk drives, etc., which may use an integrated drive electronics (IDE) interface, or a variation or enhancement thereof, such as enhanced IDE (EIDE) or ultra direct memory access (UDMA), or a small computer system interface (SCSI) based interface, or a variation or enhancement thereof, such as fast-SCSI, wide-SCSI, fast and wide-SCSI, etc., or a fiber channel-arbitrated loop (FC-AL) interface.

In the example shown in FIG. 23, memory 2308 includes email processing software 2312 and operating system 2314. Email processing software 2312 includes email scanners 208, which include scanners 118A-N, decomposers 120A-M, and scanner-driven decomposition routines 212, quarantined emails 210, spam emails 212, clean emails 214, recipient inboxes 216, as well as additional functionality that is not shown. Email scanners 208 automate the highlighting, removal or filtering of e-mail spam by scanning through incoming and outgoing e-mails in search of traits typical of spam. Such scanning may include searching for patterns in the headers or bodies of messages. Each incoming email message is scanned to determine whether it is a spam email message that is to be marked as SPAM, a dangerous spam email message that is to be quarantined, or a clean email message that is to be delivered as is to the recipient's inbox. In addition, email scanner 208 scans the email address of the sender of the email, and may also scan the first and last name of the sender of the email. Scanners 118A-N and decomposers 120A-M decompose the email messages into their constituent items and scan the items to determine their status. Each incoming email message is scanned to determine whether it is a dangerous spam email message that is to be quarantined 110, a spam email message that is to be marked as SPAM 112 and delivered to the recipient's inbox 114, or a clean email message 116 that is to be delivered as is to the recipient's inbox 114. Scanner-driven decomposition routines control the operation of scanners 118A-N and decomposers 120A-M to scan the email messages using the scanner-driven method described above. Operating system 2314 provides overall system functionality.

As shown in FIG. 23, the present invention contemplates implementation on a system or systems that provide multi-processor, multi-tasking, multi-process, and/or multi-thread computing, as well as implementation on systems that provide only single processor, single thread computing. Multi-processor computing involves performing computing using more than one processor. Multi-tasking computing involves performing computing using more than one operating system task. A task is an operating system concept that refers to the combination of a program being executed and bookkeeping information used by the operating system. Whenever a program is executed, the operating system creates a new task for it. The task is like an envelope for the program in that it identifies the program with a task number and attaches other bookkeeping information to it. Many operating systems, including UNIX®, OS/2®, and Windows®, are capable of running many tasks at the same time and are called multitasking operating systems. Multi-tasking is the ability of an operating system to execute more than one executable at the same time. Each executable is running in its own address space, meaning that the executables have no way to share any of their memory. This has advantages, because it is impossible for any program to damage the execution of any of the other programs running on the system. However, the programs have no way to exchange any information except through the operating system (or by reading files stored on the file system). Multi-process computing is similar to multi-tasking computing, as the terms task and process are often used interchangeably, although some operating systems make a distinction between the two.
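The isolation between tasks described above may be illustrated, purely as a hedged sketch, with a parent program that launches a second executable as a separate operating system process. The names parent_state, CHILD_PROGRAM, and run_child are hypothetical and chosen only for this example. The child keeps its own counter in its own address space, and its result reaches the parent only through a pipe provided by the operating system.

```python
import subprocess
import sys

# State that lives only in the parent's address space.
parent_state = {"scanned": 0}

# Program text for the child; it has its own, separate counter.
CHILD_PROGRAM = "scanned = 0\nscanned += 3\nprint(scanned)"


def run_child() -> int:
    """Run CHILD_PROGRAM as a separate process and read its output pipe."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_PROGRAM],
        capture_output=True,  # the OS pipe is the only exchange channel
        text=True,
        check=True,
    )
    return int(result.stdout.strip())
```

In this sketch the value returned by run_child() comes from the child's counter, while parent_state in the parent is left unchanged, since the two processes do not share memory.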

It is important to note that while the present invention has been described in the context of a fully functioning data processing system, those of ordinary skill in the art will appreciate that the processes of the present invention are capable of being distributed in the form of a computer readable medium of instructions and in a variety of forms and that the present invention applies equally regardless of the particular type of signal bearing media actually used to carry out the distribution. Examples of computer readable media include recordable-type media such as a floppy disc, a hard disk drive, RAM, and CD-ROMs, as well as transmission-type media, such as digital and analog communications links.

Although specific embodiments of the present invention have been described, it will be understood by those of skill in the art that there are other embodiments that are equivalent to the described embodiments. For example, the present invention may be advantageously employed in scanning outgoing email messages, as well as incoming email messages. Accordingly, it is to be understood that the invention is not to be limited by the specific illustrated embodiments, but only by the scope of the appended claims.

What is claimed is:
 1. A method, comprising: receiving an email message comprising a plurality of items; scanning the email message with a plurality of scanner software, which are configured to perform activities that are based on a same decomposition tree used for scanning the e-mail message; expanding the decomposition tree to satisfy sub-trees for scanners in the plurality of scanner software; and employing a recursive function that identifies a node in the decomposition tree and a list of scanners with which to scan, wherein: on the root node of the decomposition tree, the list of scanners comprises the plurality of scanners; at each iteration of the recursive function, each scanner in the list of scanners is called to perform a scanning operation on the e-mail message and return whether it is satisfied such that it is removed from the list of scanners; if there are any remaining scanners in the list, the node in the decomposition tree is decomposed; and the recursive function is complete when the plurality of scanners have been removed from the list of scanners.
 2. The method of claim 1, further comprising: scanning the items obtained by each of the at least one of the plurality of scanner software with that scanner software.
 3. The method of claim 2, wherein the determining comprises: determining what items of the plurality of items the email message is to be decomposed into based on items of the plurality of items that the at least one of the plurality of scanner software is capable of scanning.
 4. The method of claim 1, further comprising: scanning the items obtained by the at least one of the plurality of scanner software with that scanner software.
 5. The method of claim 4, wherein the determining comprises: determining what items of the plurality of items the email message is to be decomposed into based on items of the plurality of items that the at least one of the plurality of scanner software is capable of scanning.
 6. The method of claim 5, wherein the plurality of scanner software comprise at least one of an anti-virus scanner, an anti-spam scanner, and a bad content scanner.
 7. The method of claim 6, wherein the plurality of items of the email message comprises at least one of a MIME stream, MIME headers, an HTML item, a ZIP item, a text item, a document, and a list of URLs.
 8. The method of claim 1, wherein the email messages are incoming email messages.
 9. The method of claim 1, wherein the email messages are outgoing email messages.
 10. The method of claim 1, wherein each of the items obtained from the decomposing of the email message is included as a separate node of a decomposition tree.
 11. The method of claim 10, wherein the decomposing of the email message to obtain the items includes a function determining whether a current node of the decomposition tree being processed is a type that is supported by the at least one of the plurality of scanner software, where if the type of the node is supported, the node is scanned, including recursively decomposing the node into at least one additional node of the decomposition tree.
 12. The method of claim 1, wherein the email message is decomposed using a decomposition tree.
 13. The method of claim 12, wherein each item in the decomposition tree is scanned by the at least one of the plurality of scanner software that is capable of scanning the at least one type of the item.
 14. A system, comprising: a processor; and a memory element coupled to the processor, wherein the system is configured for: receiving an email message comprising a plurality of items; scanning the email message with a plurality of scanner software, which are configured to perform activities that are based on a same decomposition tree used for scanning the e-mail message; expanding the decomposition tree to satisfy sub-trees for scanners in the plurality of scanner software; and employing a recursive function that identifies a node in the decomposition tree and a list of scanners with which to scan, wherein: on the root node of the decomposition tree, the list of scanners comprises the plurality of scanners; at each iteration of the recursive function, each scanner in the list of scanners is called to perform a scanning operation on the e-mail message and return whether it is satisfied such that it is removed from the list of scanners; if there are any remaining scanners in the list, the node in the decomposition tree is decomposed; and the recursive function is complete when the plurality of scanners have been removed from the list of scanners.
 15. The system of claim 14, wherein the system is operable such that the items obtained by each of the at least one of the plurality of scanner software are scanned with that scanner software.
 16. The system of claim 15, wherein the determining comprises determining what items of the plurality of items the email message is to be decomposed into based on items of the plurality of items that the at least one of the plurality of scanner software is capable of scanning.
 17. The system of claim 14, wherein the system is operable such that the items obtained by the at least one of the plurality of scanner software are scanned with that scanner software.
 18. The system of claim 17, wherein the determining comprises the processor for determining what items of the plurality of items the email message is to be decomposed into based on items of the plurality of items that the at least one of the plurality of scanner software is capable of scanning.
 19. The system of claim 18, wherein the plurality of scanner software comprises at least one of an anti-virus scanner, an anti-spam scanner, and a bad content scanner.
 20. The system of claim 19, wherein the plurality of items of the email message comprises at least one of a MIME stream, MIME headers, an HTML item, a ZIP item, a text item, a document, and a list of URLs.
 21. The system of claim 14, wherein the email messages are incoming email messages.
 22. The system of claim 14, wherein the email messages are outgoing email messages.
 23. A computer program product embodied on a tangible non-transitory computer readable medium for performing operations, comprising: receiving an email message comprising a plurality of items; scanning the email message with a plurality of scanner software, which are configured to perform activities that are based on a same decomposition tree used for scanning the e-mail message; expanding the decomposition tree to satisfy sub-trees for scanners in the plurality of scanner software; and employing a recursive function that identifies a node in the decomposition tree and a list of scanners with which to scan, wherein: on the root node of the decomposition tree, the list of scanners comprises the plurality of scanners; at each iteration of the recursive function, each scanner in the list of scanners is called to perform a scanning operation on the e-mail message and return whether it is satisfied such that it is removed from the list of scanners; if there are any remaining scanners in the list, the node in the decomposition tree is decomposed; and the recursive function is complete when the plurality of scanners have been removed from the list of scanners.
 24. The computer program product of claim 23, further comprising computer code for scanning the items obtained by each of the at least one of the plurality of scanner software with that scanner software.
 25. The computer program product of claim 24, wherein the determining comprises determining what items of the plurality of items the email message is to be decomposed into based on items of the plurality of items that the at least one of the plurality of scanner software is capable of scanning.
 26. The computer program product of claim 23, further comprising computer code for scanning the items obtained by the at least one of the plurality of scanner software with that scanner software.
 27. The computer program product of claim 26, wherein the determining comprises: determining what items of the plurality of items the email message is to be decomposed into based on items of the plurality of items that the at least one of the plurality of scanner software is capable of scanning.
 28. The computer program product of claim 27, wherein the plurality of scanner software comprises at least one of an anti-virus scanner, an anti-spam scanner, and a bad content scanner.
 29. The computer program product of claim 28, wherein the plurality of items of the email message comprises at least one of a MIME stream, MIME headers, an HTML item, a ZIP item, a text item, a document, and a list of URLs.
 30. The computer program product of claim 23, wherein the email messages are incoming email messages.
 31. The computer program product of claim 23, wherein the email messages are outgoing email messages.